ABSTRACT

Many bacteriophage and prophage genomes encode 
an HNH endonuclease (HNHE) next to their cohesive 
end site and terminase genes. The HNH catalytic 
domain contains the conserved catalytic residues 
His-Asn-His and a zinc-binding site [CxxC]2. An
additional zinc ribbon (ZR) domain with one to two
zinc-binding sites ([CxxxxC], [CxxxxH], [CxxxC], 
[HxxxH], [CxxC] or [CxxH]) is frequently found at 
the N-terminus or C-terminus of the HNHE or a ZR 
domain protein (ZRP) located adjacent to the HNHE. 
We expressed and purified 10 such HNHEs and 
characterized their cleavage sites. These HNHEs 
are site-specific and strand-specific nicking endo- 
nucleases (NEase or nickase) with 3- to 7-bp 
specificities. A minimal HNH nicking domain of 76 
amino acid residues was identified from Bacillus 
phage γ HNHE and subsequently fused to a zinc
finger protein to generate a chimeric NEase with a 
new specificity (12-13 bp). The identification of a 
large pool of previously unknown natural NEases 
and engineered NEases provides more 'tools' for 
DNA manipulation and molecular diagnostics. The 
small modular HNH nicking domain can be used to 
generate rare NEases applicable to targeted genome 
editing. In addition, the engineered ZF nickase is 
useful for evaluation of off-target sites in vitro 
before performing cell-based gene modification. 

INTRODUCTION 

A large number of restriction endonucleases (REases) 
have been found in bacteria and viruses and are widely 
used in recombinant DNA technology (1-5). In contrast, 
relatively few natural nicking endonucleases (NEases or 
nickase), which introduce single-strand breaks in 
double-stranded DNA (dsDNA), have been discovered 
(3,6-8). NEases were engineered from type IIS REases
by mutating one of the two catalytic sites or disrupting
the dimerization domain (9-11), by alteration of binding
specificity of a type IIP REase (12) or by combining a
DNA binding-deficient and catalytic-proficient FokI
subunit with a DNA binding-proficient and catalytically
inert FokI subunit (13). Naturally occurring nicking
enzymes may be found as part of heterodimeric REases 
or as stand-alone enzymes. One important group of 
natural NEases are replication initiation proteins that spe- 
cifically nick the viral genome or conjugative plasmid 
during phage DNA replication or conjugative plasmid 
transfer such as gpII of f1 phage (14) and Salmonella
typhimurium relaxase encoded by plasmid pCU1 (15).
Another group of natural NEases are homing nicking 
endonucleases that play a role in gene conversion (inser- 
tion of intron and intron-encoded homing endonuclease 
into an intron-less allele) [reviewed in (16-18)]. Artificial 
zinc finger nuclease (ZFN)-based nicking enzymes 
(ZF nickases) have been constructed by fusion of three 
to four ZF arrays with FokI cleavage domains, one
catalytically proficient (FokI R+) and the other catalytically
deficient (FokI R-). Transient heterodimer formation by R+/R-
cleavage domains upon binding to the target sites with a 
5- to 6-bp spacer leads to DNA nicking which stimulates 
targeted gene disruption or gene addition. Such nick- 
induced gene targeting displays a low frequency of inser- 
tion or deletion, in sharp contrast to the non-homologous 
end-joining process that follows a dsDNA break and 
repair response (19-21). Thus, it is desirable to use ZF 
nickases to introduce nicks to initiate a DNA repair 
response that results in homology-directed recombination 
(HDR) for genome editing and targeted gene correction. 

In addition to the ZF nickases used in genome editing, a 
nicking variant was engineered from I-AniI homing
endonuclease (HEase) and its utility was demonstrated in
nick-induced gene correction via HDR in human cells
(22). Furthermore, it was shown that an I-AniI nicking
variant induces homologous recombination with both 
plasmid and adeno-associated virus (AAV) vector tem- 
plates and the nicking variant alleviates the in vivo 
toxicity problem conferred by the dsDNA cleaving 
enzyme (23). A strand-specific nicking variant has also
been isolated from I-SceI HEase (24). The inherent
disadvantage of using HEases in genome editing is that most
of the HEases studied so far tolerate degenerate sequences, 
and off-target sites with a few base pair mismatches are 
also cleaved (25,26). ZF recombinases (ZFRs) have also 
been constructed by fusion of ZF arrays with Tyr or Ser 
recombinase for ZFR-mediated integration of the donor 
plasmid in mammalian genomes and site-specific recom- 
bination in a bacterial genome (27,28). 

The HNH superfamily nucleases include HEases, 
REases, structure-specific endonucleases, non-specific
nucleases, CRISPR-associated protein Cas9 (CRISPR,
clustered regularly interspaced short palindromic repeat) and
DNA repair enzymes. The invariant His residue in the 
conserved motif HNH (sometimes found as HNK or 
HNN) serves as the general base that activates a water 
molecule for a nucleophilic attack on the sugar phosphate 
backbone of nucleic acids. We recently investigated the
restriction-modification (R-M) systems in the sequenced
genome of Bacillus cereus ATCC 10987 (GenBank:
AE017194) and found a prophage-encoded multi-specificity
C5 methyltransferase (MTase) and a ParB-MTase fusion 
protein with non-specific DNA nicking activity (29). In 
the vicinity of the prophage-encoded C5 MTase, three 
genes were predicted to encode DNA-cleaving enzymes 
(genes in the order of Vrr_Nuclease, DNA Integrase, 
HNH endonuclease (HNHE), Terminase S subunit, 
ParB-MTase, C5 MTase). Here, we report the DNA 
nicking activity of the Bacillus cereus HNHE. In addition, 
we identified and characterized nine HNHE homologs and 
their nicking sites. We defined a minimal nicking domain of 
76 amino acids (aa) from phage γ HNHE and fused it to
the zinc finger protein (ZFP) (Zif268) to generate a rare
nicking enzyme with 12- to 13-bp specificity.
Furthermore, we examined a number of off-target sites in
λ DNA for the engineered NEase. The significance and
possible biological function of this large family of 
strand-specific DNA NEases is discussed. 

MATERIALS AND METHODS 

Cloning and enzyme purification 

Restriction and modification enzymes, cloning/expression 
vectors and chitin beads were provided by NEB. 
Synthetic genes encoding HNHEs with optimized 
Escherichia coli expression codons were purchased from 
IDT and subcloned into pTYB1 (NdeI-XhoI insert). To
construct the Zif268 and N.φGamma fusion, a DNA
fragment encoding the minimal N.φGamma catalytic
domain (F5, 76 aa) flanked by NheI and XhoI sites
was inserted into pTYB1. The Zif268-encoding PCR
fragment flanked by NdeI and NheI sites was then inserted
into pTYB1-N.φGamma (F5). In this way, a linker of
variable length, encoded in the reverse primer, can be
inserted between the ZF domain and the N.φGamma
catalytic domain F5. Expression of the
intein-chitin-binding protein-HNHE fusions was induced
at 16°C overnight by addition of 0.3 mM IPTG to
late-log-phase cells. The standard chitin column protein
purification procedure was followed as recommended
by the supplier. HNHEs were further purified on
HiTrap Heparin HP columns (5 ml) using an AKTA
FPLC purification system (GE Healthcare) and stored
at -20°C in a storage buffer (50 mM KCl, 10 mM
Tris-HCl, 10 mM DTT, 50% glycerol).

Structural modeling of Zif268::N.φGamma
fusion endonuclease

The model of the N.φGamma endonuclease domain was
obtained using the program I-TASSER (30). The duplex
DNA was built and modeled together with the Zif268
(PDB ID: 1AAY) and N.φGamma endonuclease domains
using the program Coot (31).

RESULTS 

Identification of HNHE N.BceSVIII from B. cereus 
and its homologs 

The small predicted HNHE (121 aa) found in B. cereus 
ATCC 10987 (Supplementary Figure S1A) was expressed
in E. coli using the IMPACT protein expression system
(Supplementary Figure S1B). After purification, the
protein displayed low DNA nicking activity in Mg++
buffer and robust activity in Mn++ buffer (data not
shown) and was subsequently named N.BceSVIII. It is
partially active in reaction buffers with Co++ and is
completely inhibited by addition of ethylenediaminetetraacetic
acid (EDTA) (data not shown). Run-off DNA sequencing
of the cleavage products indicates that N.BceSVIII cuts
frequently, with the recognition sequence 5' S↓RT 3'
(S = C/G, R = A/G) (Supplementary Figure S1C
and D). We will use a downward pointing arrow to
indicate nicking of the strand shown and an upward 
pointing arrow to indicate nicking of the complementary 
strand, following accepted nomenclature (32). 

A BlastP search of the GenBank NR database using the 
N.BceSVIII aa sequence as a query revealed over 200 related 
HNHEs, all similarly small (95 to ~300 aa) and predomin- 
antly encoded by phage and prophage. These HNHE 
homologs can be divided into four major groups. Group 1 
contains one to two zinc-binding sites with two to four zinc 
ribbons (ZRs) each with the motif [CxxxxC], [CxxxxH], 
[CxxxC], [HxxxH], [CxxH] or [CxxC] at the N-terminus, 
and the HNHE catalytic domain with one zinc-binding 
site with the motif [CxxC]2 located at the C-terminus (each
zinc ion is presumably coordinated by a tetrahedral 
geometry and thus requires two ZRs). Group 2 reverses 
the domain organization of the Group 1 enzymes. Group 
3 lacks the N-terminal zinc-binding region of Group 1 
and contains only one zinc-binding site in the HNHE cata- 
lytic domain. A non-specific HNHE (gp74) with the Group 3 
domain architecture has been characterized recently from a 
λ-related phage HK97 (33). Group 4 HNHEs are similar to
Groups 1 and 2, but located side-by-side with another
ZR domain protein (ZRP) either immediately upstream or
downstream. Some Group 4 enzymes are fused to the
adjacent ZRP (see, e.g. Bth193 HNHE below, which has
four putative zinc-binding sites). Some putative HNHEs 
carry additional functional domains such as a 
5mC-recognition domain (MspJI family, McrA family)
(34) (S.-Y. Xu, S. H. Chan and Y. Zheng, unpublished 
results), an NTPase domain (AAA superfamily, Walker 
motif A), a metalloprotease (MPN) domain or a domain of 
unknown function such as DUF222 (pfam02720). This 
report focuses on HNHEs in Groups 1, 2 and 4. 
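
The zinc ribbon motifs listed above are simple spacing patterns, so a candidate protein sequence can be screened for them with regular expressions. The following is a minimal sketch, not the procedure used in this study: the function name and the toy sequence are hypothetical, and a motif hit only flags a possible zinc-binding pair, it does not demonstrate zinc coordination.

```python
import re

# Zinc ribbon (ZR) motifs named in the text; 'x' stands for any residue.
ZR_MOTIFS = {
    "CxxxxC": r"C.{4}C",
    "CxxxxH": r"C.{4}H",
    "CxxxC": r"C.{3}C",
    "HxxxH": r"H.{3}H",
    "CxxC": r"C.{2}C",
    "CxxH": r"C.{2}H",
}

def find_zr_motifs(protein):
    """Return sorted (start, motif_name, matched_text) tuples (0-based start)."""
    protein = protein.upper()
    hits = []
    for name, pattern in ZR_MOTIFS.items():
        # The lookahead lets overlapping occurrences be reported as well.
        for m in re.finditer(f"(?=({pattern}))", protein):
            hits.append((m.start(), name, m.group(1)))
    return sorted(hits)

# Hypothetical toy sequence for illustration only; it is not a real HNHE.
toy = "MKCAACGGSHNLEHHVPCPKCNAHGG"
for start, name, text in find_zr_motifs(toy):
    print(start, name, text)
```

Because each zinc ion is presumably coordinated by two ZRs, hits would normally be inspected in pairs rather than one at a time.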

Characterization of Bacillus anthracis phage
γ HNH nickase

One HNHE homolog (accession YP_338236, 127 aa) from
Group 1 was found in the genome of B. anthracis phage γ
(accession NC_007458) (35), which we will refer to as
phage γ HNHE (N.φGamma or N.phiGamma). Its
secondary structure as predicted by the Phyre server is shown
in Supplementary Figure S2A (36). A synthetic gene was
expressed in E. coli and the gene product purified by
chromatography through two columns (Figure 1A).
N.φGamma is active in the presence of Mg++, Mn++ or
Co++, partially active in the presence of Ca++, Ni++ or
Zn++ and inactive in the presence of EDTA (Figure 1B).
Run-off sequencing of the cleavage products indicates that
N.φGamma nicks predominantly at CG↓GT sites in the
presence of Mg++ (Figure 1C and D). A small amount of
linear DNA was also detected in Mg++ buffer, most likely
due to the nicking of two closely positioned sites on
opposite strands. The sequence specificity is reduced to
2-3 bp recognition in Mn++ buffer as determined by
run-off sequencing of the cleavage products (data not
shown). Using an independent method, we digested
bacteriophage λ DNA in the presence of Mn++ and cloned
digestion products after performing a blunting reaction.
Sequencing of the cloned fragments also revealed a relaxed
specificity with nicking occurring at sites with one to two
mismatches relative to the cognate site (data not shown).
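
Because these enzymes are strand-specific, a site such as CGGT has to be looked for on both strands: an occurrence of ACCG on the top strand is a CGGT site on the complementary strand. The sketch below illustrates one way to list such sites with IUPAC degenerate codes. It is an illustration under stated assumptions, not the analysis used here: the function names and the toy sequence are hypothetical, and the code reports sequence matches only, not whether a site is actually nicked.

```python
# IUPAC single-letter codes (the same codes used in the Table 1 footnote).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def matches(site, window):
    """True if every base of window is allowed by the degenerate site."""
    return all(base in IUPAC[code] for code, base in zip(site, window))

def find_sites(dna, site):
    """Return (position, strand) for each match of a degenerate site.

    position is the 0-based index of the leftmost base on the top strand;
    strand is '+' for the strand shown and '-' for its complement.
    """
    dna = dna.upper()
    n = len(site)
    hits = []
    for i in range(len(dna) - n + 1):
        window = dna[i:i + n]
        if matches(site, window):
            hits.append((i, "+"))
        if matches(site, reverse_complement(window)):
            hits.append((i, "-"))
    return hits

# Hypothetical toy sequence: one CGGT on the top strand, one on the bottom.
print(find_sites("TTCGGTAAACCGTT", "CGGT"))
```

A palindromic site would be reported once per strand at the same position, which is the intended behaviour for a strand-specific enzyme.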

Heat treatment of ~5 U of N.φGamma at 80°C for 20
min completely abolished its activity, whereas treatment at 
lower temperatures (55-75°C) resulted in only partial loss 
of nicking activity. Furthermore, treatment at 80°C for 
only 10 min was not sufficient for complete inactivation 
(data not shown). 

Mapping the minimally functional nicking domain
in N.φGamma

Catalytic domains of several type IIS REases including
FokI, BmrI and BpuJI, when cloned and purified in the
absence of the DNA binding domains, have been shown
to display non-specific nuclease activity (37-39). To test
whether the N.φGamma catalytic domain (the HNH
domain) displays a similar non-specific activity in the
absence of the ZR domain, we deleted 66 aa residues (Δ66 aa)
from the N-terminus of the wild-type (wt)
enzyme, resulting in a mutant containing only the 61
C-terminal aa residues (F6). Additional deletion mutants
were also constructed, removing 22 aa (F3), 43 aa (F4)
and 51 aa (F5) from the N-terminus, respectively
(Supplementary Figure S2A). The deletion variants were
purified and their nicking activity was assayed in Mg++ or
Mn++ buffer. Truncated mutants F3 and F4 displayed
slightly lower activity in Mn++ buffers compared with
the full-length enzyme. Mutant F5 showed nicking
activity in Mn++ buffer and low activity in Mg++ buffer
(Supplementary Figure S2B and C), and mutant F6 had
minimal nicking activity in Mg++ or Mn++ buffers. The
nicking specificities of F4 and F5 are nearly identical: both
deletion variants nick the cognate sites (CG↓GT) or
variations thereof with one base mismatch (F5 'star' sites
CG↓GG, CG↓GC, CA↓GT or CG↓AT; data not shown).
We concluded that the 61-aa C-terminal HNHE domain is
not sufficient to constitute an active endonuclease.
The smaller deletion variant F5 (76 aa in length) possesses
the necessary elements for a sequence-specific NEase
despite its lower activity and relaxed specificity (also see
N.φGamma-F5 and Zif268 fusion below).

Enzyme concentration dependence in nicking reactions,
DNA nicking kinetics in a time course by N.φGamma
and flanking sequence effect on nicking efficiency

Partially purified N.φGamma and a deletion variant F3
were used to nick pBR322 DNA. At low enzyme
concentration, supercoiled DNA was converted to nicked
circular form. At high enzyme concentration (2-4x),
however, a small fraction of DNA was converted to
linear form, probably generated from nicks introduced
on the opposite strands at adjacent sites (Supplementary
Figure S2D). Supplementary Figure S2E shows a time
course of N.φGamma-mediated nicking reactions. The
substrate DNA (pBR322) was incubated with diluted
N.φGamma from 2 min to 2 h in a limited digestion.
Supercoiled DNA was partially converted to nicked
circular DNA in the time period. No linear DNA was
detected due to the limited digestion. In order to
examine the nicking efficiency of various ACCG sites in
pBR322, the DNA substrate was incubated with purified
N.φGamma at 37°C for 2 h and then subjected to run-off
sequencing. Supplementary Figure S2F shows a few
examples of the nicked AC↑CG sites followed by A, G,
C or T. Overall, the DNA sites ACCGR appeared to be
nicked more efficiently than ACCGY (ACCGR > ACCGC
> ACCGT), as judged by the 'A' peak in the double peaks at
the run-off sites. A quantitative nicking assay to evaluate
the nicking efficiency of all possible flanking sequences
(N-3 N-2 N-1 ACCG N1 N2 N3) remains to be developed.
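
To give a sense of the size of such an assay, the flanking contexts can simply be enumerated: three variable bases on each side of the ACCG core give 4^6 = 4096 substrates. The sketch below is only an illustration of that count; the function names are hypothetical and the purine/pyrimidine split at N1 merely mirrors the ACCGR versus ACCGY comparison described above.

```python
from itertools import product

CORE = "ACCG"
BASES = "ACGT"

def flanking_contexts(core=CORE, flank=3):
    """Yield every sequence of the form N(flank) + core + N(flank)."""
    for left in product(BASES, repeat=flank):
        for right in product(BASES, repeat=flank):
            yield "".join(left) + core + "".join(right)

def purine_after_core(context, core=CORE, flank=3):
    """True if N1, the base immediately 3' of the core, is a purine (R)."""
    return context[flank + len(core)] in "AG"

contexts = list(flanking_contexts())
n_accgr = sum(purine_after_core(c) for c in contexts)
print(len(contexts), n_accgr, len(contexts) - n_accgr)
```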

Nicking short DNA duplex oligos by N.φGamma

DNA duplex oligos with one ACCGG site (28-mer) or two
ACCGG sites (32-mer) were digested by N.φGamma at
37°C for 1 h in either Mg++ or Mn++ buffer and the
cleavage products were resolved in a denaturing gel.
Both substrates were nicked and new products were
detected (data not shown). In support of this,
3' FAM-labeled duplex oligos with one ACCGN site
(N = A, G, C or T) were also nicked by N.φGamma.
Supplementary Figure S2G shows the nicking results of
ACCGA, ACCGG, ACCGC and ACCGT duplex oligos
at four enzyme concentrations. Consistent with the
nicking site efficiency in pBR322, the 28-bp short oligos
with ACCGR sites were nicked more readily than
ACCGY sites. The nicking reactions nearly reached
completion at high enzyme concentration for ACCGR oligos,
whereas partial nicking was achieved using the ACCGY
substrates. Although the exact size of the nicked products
remains to be determined by running a sequencing gel at
single-nucleotide resolution, the results show that short
DNA duplex oligos can serve as substrates for nicking
reactions by N.φGamma.

Figure 1. Characterization of phage γ HNHE (N.φGamma). (A) SDS-PAGE analysis of the partially purified N.φGamma (indicated by an arrow,
predicted molecular mass of 15.5 kDa). (B) DNA cleavage assay on pBR322 in various buffers. NEB buffers B1-B4 all contain 10 mM MgCl2. Buffer
without divalent cations (10 mM Tris-HCl, pH 7.5, 50 mM NaCl, 1 mM DTT) was supplemented with metal ions (5 mM) as indicated above each
lane. (C) N.φGamma nicking consensus sequence CG↓GT (AC↑CG) was compiled by WebLogo (http://weblogo.berkeley.edu/logo.cgi). (D) Run-off
sequencing of two nicked sites. The intervening sequences have been shortened to show the two sites nicked on opposite strands. Double peaks
(A/C, A/G, A/T) indicate a nick on the template strand, as the extrinsic A was added by Taq DNA polymerase through its template-independent terminal
nucleotide transferase activity. Undigested pBR322 was used as a control in sequencing (top two chromatograms).

Homologs of phage gamma HNHE (N.φGamma)

A BlastP search in GenBank revealed that Bacillus phage
Cherry and WBeta genomes encode a putative HNHE
with an aa sequence identical to N.φGamma (data not
shown). N.φGamma also shares 99% aa sequence
identity with a putative HNHE (protein ID: EJR41403,
129 aa, probably encoded by a prophage belonging to
HK97 family phage) in the genome of B. cereus VD045
strain, and 91% aa sequence identity with a putative HNHE
(protein ID: YP_001375827, 127 aa, probably encoded by
a prophage similar to P27 family phage, and located next
to open reading frames (ORFs) coding for terminase small
and large subunits) in the genome of Bacillus cytotoxicus
NVH 391-98 strain. The next group of putative HNHEs
from Geobacillus phage or prophage shares 51-56% aa
sequence identity with N.φGamma. Because of the
high aa sequence identity (similarity) among these
HNHEs, it is highly likely that they are nicking enzymes
with nicking site CG↓GT (AC↑CG) or minor variations
of such a sequence. Consistent with this prediction, the
HNHE encoded by Geobacillus virus E2 (N.φE2), which
shares 51% aa sequence identity with N.φGamma,
displays the nicking specificity of CG↓GT (data not shown)
(Table 1).

Lactobacillus phage Sal2 HNH nicking endonuclease

An HNHE homolog was found in the Lactobacillus phage
Sal2 with additional aa blocks located upstream and
downstream of the HNH catalytic domain (protein ID:
YP_535176), which may confer additional base
recognition (40). Supplementary Figure S3A shows the two
additional blocks of aa sequences (5-aa and 11-aa
residues) at the N-terminus and one extra block of 32-aa
residues located at the C-terminal end compared with
phage N.φGamma. Phage Sal2 HNHE (N.φSal2 or
N.phiSal2) was purified and used to digest pBR322.
N.φSal2 shows robust nicking activity in Mn++ buffer,
in which it efficiently nicks DNA sites TG↓CTC (GAG↑CA)
and related sites with 1- to 2-bp mismatches from the core
sequence such as CG↓CTC or TG↓TTC (Figure 2A, and
data not shown). Supplementary Figure S3B shows the
nicking sites (GARCA or its variants) compiled by WebLogo using
pBR322 digested in Mn++ buffer. N.φSal2 partially nicks
pBR322 in the presence of Mg++ (data not shown).
Figure 2B shows the five nicking sites, which span
14 bp of DNA with a core recognition site of GAGCADNW
(SSN4GAGCADNW) (S, C/G; D, A/G/T, not C;
W, A/T). The sixth base of the core is intolerant of a C
because no GAGCAC-related sites were nicked (data not
shown). In summary, N.φSal2 shows optimal nicking
activity in Mn++ buffer, nicking at TG↓CTC
(GAG↑CA) or variant sites with 1- to 2-bp mismatches.

Table 1. Summary of DNA nicking sites by phage- or prophage-encoded HNHE and engineered chimeric nicking enzyme
ND, not determined. DNA single letter code: W, A/T; S, G/C; R, A/G; Y, C/T; B, C/G/T (not A); D, A/G/T (not C); V, A/C/G (not T); H, A/C/T
(not G).

?, need more experimental evidence.

Underlined target sites with 4-bp variations (ACCG, ACCR, ACCV, ACCS, RTCG, AGCA, ACYG) are shared by these HNHEs.
Proposed nomenclature: HNH nicking enzymes encoded by phage: N.φ+phage name; NEases encoded by prophage or host genome (5,32).

HNHE Bth193 from Bacillus thuringiensis
strain T13001

We identified another N.φGamma homolog encoded on a
prophage in the genome of B. thuringiensis strain T13001
(protein ID: ZP_04118662). This enzyme, named
N.BthT13001I (a.k.a. Bth193 HNHE), has an additional
N-terminal region relative to N.φGamma containing four
ZR motifs (Supplementary Figure S4A). N.BthT13001I
was expressed in E. coli and partially purified. It nicks
DNA sequences GG↓GT or CG↓GT in Mn++ buffer as
determined by run-off sequencing (Supplementary Figure
S4B and D, data not shown). The nicking sites in pBR322
were compiled by WebLogo as SG↓GT (Supplementary
Figure S4B). In Mg++ buffer, however, N.BthT13001I
recognizes and nicks DNA sites with a core sequence of
CSG↓GT (AC↑CSG) (Supplementary Figure S4C).
Based on the high sequence similarity with N.φGamma,
we tentatively concluded that the HNH catalytic domain
recognizes the SGGT sequence, and the additional ZRs at
the N-terminus might contribute to the extra base
C recognition in CSGGT.

We also tested whether the HNH catalytic domain is
interchangeable between N.BthT13001I and
N.φGamma. The N-terminal ZR of N.BthT13001I was
fused with the 76-aa N.φGamma catalytic domain (F5)
to form a chimeric protein (the arrow in Supplementary
Figure S4A indicates the fusion junction). The nicking
sites of the fusion NEase are nearly identical to those of
N.BthT13001I, but with a preference for the CCG↓GT
(AC↑CGG) sequence, further confirming the 'C' base
recognition conferred by the extra ZRs (data not shown). We
concluded that the HNH catalytic domain is readily
exchangeable between Bth193 and N.φGamma and the
chimeric enzyme N-N.BthT13001I::C-N.φGamma (F5)
prefers CCG↓GT sites for nicking.

HNH nicking endonuclease gp54 from Lactobacillus
phage Lrm1 and other HNHEs

In the sequenced genome of Lactobacillus phage Lrm1
(phage genome accession number: EU246945) (41), the
last gene (gp54, protein ID: ABY84355) was annotated
as a terminase S subunit. However, two other genes
encoding terminase small and large subunits are
located at the left arm of the phage genome
(Supplementary Figure S5A). This predicted protein
(gp54, 264 aa long) contains an HNH catalytic domain
near the N-terminus (H77-N93-H102) and an AAA-NTPase
domain (also termed Walker A motif GxxxxGK[S/T]), but
lacks the Walker B motif h4[D/E] (h4 = four
hydrophobic aa residues) at the C-terminus. In addition,
the C-terminus contains a putative zinc-binding motif
(2xZR [HxxxH], [RxxxH] or [HxxxD]; and seven other
Lactobacillus phage encode close homologs carrying
the 2xZR motif [HxxxH]2) (data not shown). Phage
Lrm1 HNHE (N.φLrm1 or N.phiLrm1) was expressed
in E. coli, partially purified by affinity chromatography
through a chitin column, and used to digest pBR322
DNA. It displays DNA nicking activity in Mn++ buffer
and low activity in Mg++ buffer (Supplementary
Figure S5B and C). Addition of NTP (ATP, GTP, CTP
or UTP) failed to stimulate N.φLrm1 nicking activity
(data not shown). The partially nicked DNA was
gel-purified and subjected to run-off sequencing.
Supplementary Figure S5D summarizes the nicking sites
as HSSG↓GT (AC↑CSSD), which resemble the nicking
sites of Bth193, CSG↓GT (AC↑CSG). We do not know
the function of the AAA-superfamily domain (Walker A
motif) at the C-terminus of N.φLrm1, as neither NTP nor
dNTP has any stimulatory effect on DNA nicking
activity. This domain could serve as an accessory
domain interacting with DNA or other proteins. We
have not studied the interaction of this HNHE with the
phage Lrm1 terminase.

Figure 2. Characterization of phage Sal2 HNHE (N.φSal2). (A) Run-off sequencing of two nicked sites TG↓CTC (GAG↑CA) in pBR322 which was
digested by N.φSal2 in Mn++ buffer. (B) Nicking sites in pBR322 (Mg++ buffer) compiled by WebLogo.

We also partially purified the HNHEs from Bacillus
phage phi105 (protein ID: ADF59184), Geobacillus virus
E2 (GBVE2_gp070, protein ID: YP_001522898)
(42), Clostridium phage phi3626 (gp50, protein ID:
NP_612879) and Staphylococcus aureus strain Y74T
prophage (ORF Sap040a_009, protein ID: ACZ58991)
(data not shown). The nicking sites of these HNHEs are
summarized in Table 1.

Construction of Zif268 and N.φGamma (F5, 76 aa)
fusion nicking endonuclease

A large number of natural ZFPs have been identified in
the past 25 years consisting primarily of eukaryotic
transcription factors (see ZF database) (43), and additional
ZFPs with high DNA binding affinity have been
engineered by genetic selection or phage display (44-47). Thus,
ZFPs provide a collection of DNA binding modules
(arrays) ranging from 6 to 12 bp with two to four
individual ZFs linked together. ZFNs have been engineered to
cleave large recognition sequences of 12-24 bp (48-52).
Each individual ZFP forms a DNA recognition module with
two antiparallel β-sheets and one α-helix (ββα-Zn), where
the second β-sheet contacts the sugar phosphate backbone
and the α-helix contacts the base pairs in the DNA major
groove as first revealed by the structure of the
Zif268-DNA complex (53).

We next attempted the fusion of the minimal
N.φGamma nicking domain (F5, 76 aa) to Zif268
(3 ZFs, 117 aa) (partial ZFP sequence of the North
American songbird Vireo cassinii, GenBank
protein ID: AAO84760, 210 aa) to generate the chimeric
enzyme Zif268III::N.φGamma F5 (designated as ZifIII
for short). Two fusion proteins were constructed, the R1
fusion with a 13-aa linker and the R2 fusion with a 3-aa linker
between the two functional domains (Figure 3A). The two
fusion proteins were purified to near homogeneity by
chromatography (Figure 3B) and used to digest pUC19
(Figure 3C) and pUC19-Zif3 (containing a Zif268
recognition sequence GCGGGGGCG). Run-off sequencing of
the nicked products indicated no apparent nicking taking
place within or near the GCGGGGGCG sequence (data
not shown). However, a clear nicking site was detected at
AT↑CG-N6-GCGCGGGGA (a partial match to the
N.φGamma and Zif268 recognition sequences
separated by a 6-bp spacer) in pUC19 and pUC19-Zif3 (nicking
the opposite strand CG↓AT, data not shown). This
unintended nicking site (off-target site or 'star' site) prompted
us to design a new DNA substrate with the sequence
ATCG-N6-GCGTGGGCG. Figure 3D shows a time
course of the nicking reactions of pUC derivatives
carrying the cognate site and two 'star' sites. Figure 3E
shows nicking of the cognate site as a function of enzyme
concentration with a fixed amount of DNA. A sharp run-off signal was
detected on the ZifIII R1-digested cognate site and nicking
at the 'star' site of the pUC backbone was strongly
suppressed although the chromatogram peak is low at the
'star' site (Figure 3F, third sequencing run). A small
fraction of the DNA was nicked on the opposite strand
2 nt apart, generating a 2-base 5'-overhang (Figure 3F, first
sequencing run), which likely explains the small
amount of linear DNA detected in the agarose gels in
Figure 3D and E. For the pUC derivative with one 'star'
site inserted at the multiple cloning sites and the original
'star' site in the backbone, both sites were partially nicked
(Figure 3F, fourth sequencing run). Nicking at ATCG,
ACCG or ACTG sites was not detected (one undigested
ATCG site is indicated by a blue line in Figure 3F, data
not shown). Nicking reactions performed in high salt
buffer (B3) could minimize the promiscuous nicking
activity, as some smearing was detected in prolonged
incubation in B1, B2 and B4 (data not shown).
Double-stranded breaks could be minimized by shorter
digestion (<30 min) and in 150 mM NaCl (~80%
supercoiled DNA converted to nicked circular in 2 h in 150 mM
NaCl with minimal dsDNA breaks, data not shown).
These results indicate that a ZF nickase can be
constructed from a ZFP (three fingers) and the N.φGamma
minimal nicking domain to create a chimera capable of
nicking a 12- to 13-bp target site. Addition of two or more
ZFs may further increase the nicking specificity, which is a
prerequisite for applications in genome editing.
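
The link between site length and rarity can be made concrete with a back-of-the-envelope estimate. In random DNA with equal base frequencies, a fully specified n-bp site is expected about once every 4^n bp per strand. The sketch below is only that idealized estimate and not a result of this study: the function name is hypothetical, the human genome size is a round approximation, and real genomes deviate from uniform base composition.

```python
def expected_sites(genome_bp, site_bp, both_strands=True):
    """Expected count of one fully specified, non-palindromic site
    in random DNA with equal base frequencies (an idealization)."""
    per_strand = genome_bp / 4 ** site_bp
    return 2 * per_strand if both_strands else per_strand

LAMBDA_BP = 48_502   # bacteriophage lambda genome length in bp
HUMAN_BP = 3.1e9     # rough haploid human genome size (assumed round figure)

# 4 bp: natural HNHE core; 12-13 bp: three-finger chimera;
# 18 bp: a hypothetical longer array with two more fingers.
for n in (4, 12, 13, 18):
    print(n, expected_sites(LAMBDA_BP, n), expected_sites(HUMAN_BP, n))
```

Under these assumptions a 12- to 13-bp site is rare in a phage-sized genome but still recurs many times in a mammalian-sized one, which is consistent with the suggestion that additional ZFs would be needed for genome editing.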

Off-target nicking sites by Zif268::N.φGamma F5

To investigate potential 'star' sites of ZifIII R1, λ DNA
was nicked by the chimeric NEase and the digested DNA
was subjected to run-off sequencing at the suspected 'star'
sites (9- to 11-bp matches). One strong off-target nicking
site was found with the sequence AC↑TG-N6-
GCCGTGGAG (CA↓GT, 2-bp matches in each ZF
binding site, λ coordinate 3981, Supplementary Figure
S6A). Two 'star' sites with the sequence AC↑CC-N6-
GCGCGGGTT (GG↓GT; 3-bp, 2-bp, 1-bp matches to
ZF3, ZF2 and ZF1 sites, respectively; λ coordinate 4486)
and ACTG-N6-GCGGGGGTA (CA↓GT; 3-bp, 3-bp,
1-bp matches in ZF3, ZF2 and ZF1 sites, respectively;
λ coordinate 11111) were partially nicked by the ZifIII
R1 NEase (Supplementary Figure S6B, data not shown).
No apparent nicking was found in the suspected 'star' sites
listed in Supplementary Figure S6C, which contains 5- to
7-bp matches to the Zif268 binding site (data not shown).
These off-target sites indicate that 2-bp matches in each
ZF are likely strong 'star' sites. The 5- to 6-bp matches in
ZF3 and ZF2 in conjunction with 1-bp match in ZF1 are
probable 'star' sites.
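
The per-finger match counts quoted above can be reproduced by comparing each 3-bp subsite of a candidate sequence with the Zif268 target. The sketch below is an illustration under stated assumptions, not the authors' procedure: the function name is hypothetical, the target is written as GCGKGGGCG (K = G or T) to merge the two cognate sequences used in this work (GCGGGGGCG and GCGTGGGCG), and the assignment of ZF3 to the 5' triplet is inferred from the match counts given in the text.

```python
# Zif268 target, 5'->3'. K (G or T) at the first base of the middle triplet
# is an assumption that merges GCGGGGGCG and GCGTGGGCG.
ZIF268_SITE = "GCGKGGGCG"
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "K": "GT"}
FINGERS = ("ZF3", "ZF2", "ZF1")  # ZF3 taken to read the 5' triplet

def per_finger_matches(candidate, site=ZIF268_SITE):
    """Count matching bases in each 3-bp finger subsite of a 9-bp candidate."""
    candidate = candidate.upper()
    if len(candidate) != len(site):
        raise ValueError("candidate must be the same length as the site")
    counts = {}
    for k, finger in enumerate(FINGERS):
        codes = site[3 * k:3 * k + 3]
        bases = candidate[3 * k:3 * k + 3]
        counts[finger] = sum(b in IUPAC[c] for c, b in zip(codes, bases))
    return counts

# The three off-target ZF-binding sequences described in the text.
for seq in ("GCCGTGGAG", "GCGCGGGTT", "GCGGGGGTA"):
    print(seq, per_finger_matches(seq))
```

A simple ranking of candidate sites by these counts could be used to choose which suspected 'star' sites to sequence first, although binding in practice also depends on context that this count ignores.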

DISCUSSION 

The minimal HNH catalytic domain with its own
sequence specificity

The HNH motif is highly conserved among the HNHEs,
although in some cases the second His residue can be
substituted by a Lys or Asn residue (H-N-K/N). The
first conserved His residue acts as a general base to
activate a water molecule for nucleophilic attack on the
DNA phosphodiester bond, and the Asn orients the
catalytic His residue and scissile phosphate in the correct
position for DNA hydrolysis (54,55). The second His (or
Lys/Asn) stabilizes the leaving group. HNH domains,
which generally fold into a secondary structure described
as 'ββα-metal', bind the DNA backbone in the minor
groove, whereas additional residues mediate
sequence-specific interactions with the DNA (56). The zinc metal
ion is coordinated in a tetrahedral geometry
by the two ZRs [CxxC]2 or [CxxH, CxxC] (55,57,58).

Figure 3. Characterization of Zif268::N.φGamma F5 chimeric nickase (ZifIII). (A) Amino acid sequence of the fusion HNHE (Zif268 sequence
shown in blue). The sequence was derived from the partial ZFP of Vireo cassinii (GenBank: gb AAO84760). Two fusion constructs were made, R1 with a
13-aa linker (7 aa: DKKAEKA from the native songbird Zif268) and R2 with a 3-aa linker between the two functional domains. (B) SDS-PAGE
analysis of purified ZifIII R1 and R2 HNHEs. Lanes 2-5, 6 and 7, purified R1 and R2 fusion proteins from chitin columns. Lanes 8-11, purified R1
fusion protein from a heparin column (HiTrap Heparin HP, 5 ml). (C) Nicked pUC19 DNA in a time course (lanes 2 and 6, 20 min to 2 h). Lanes
7 and 8, Nt.BspQI-nicked and SphI-digested DNA, respectively. (D) Digestion of pUC-ATCG-N6-Zif3 (GCGTGGGCG) and pUC-ATCG-N6-Zif3star
(GCGCGGGGA) in a time course (20 min to 2 h) in Buffer 3 (100 mM NaCl). A small amount of linear DNA (<5% of total DNA)
appeared after 20 min of digestion. Note: there is a pre-existing 'star' site in pUC19 in addition to the cognate site or 'star' site inserted at the
multiple cloning sites. (E) Digestion of pUC-ATCG-N6-Zif3 (GCGTGGGCG) by varying concentrations of ZifIII R1 in NEB Buffer 3 for 30 min.
The Nt.BspQI-nicked DNA was used as a marker. The slow migrating DNA in (E) and (F) is probably nicked dimer. (F) Run-off sequencing of
nicked DNA (B3, 1 h, 37°C) with forward and reverse primers. Red line, Zif268 cognate site; green line, Zif268 'star' site; blue line, N.φGamma site.
Arrows indicate the nicking sites where double peaks appeared. SC, supercoiled; L, linear; NC, nicked circular DNA.

Secondary structure prediction by the Phyre server (59)
revealed that the N.φGamma catalytic domain (F6, 66 aa)
contains a β-β-α-α structure, as predicted from the known
structures of an HNHE from Geobacter metallireducens
GS-15 (an HNHE without the extra ZR domain, pdb
accession: 2qgpB, Northeast Structural Genomics
Consortium target GmR87) (A. P. Kuzin et al., unpublished
results) and PacI (an HNH family REase) (57). The
active deletion variant F5 (76 aa) contains an extra
α-helical structure that may be involved in increased
DNA-binding affinity or dimerization. It is somewhat unexpected
that deletion of the N-terminal 51-aa ZR
domain does not totally abolish the specificity of
N.φGamma. The deletion only weakened the activity in
Mg++ buffer and relaxed the sequence recognition. The
76-aa minimal nicking domain has been used in the construction
of a fusion protein with a ZFP. The 'star' sites of
N.φGamma in Mn++ are usually one base different from
the cognate site AC^CG, e.g. AT^CG, CC^CG, AC^TG
or GC^CG. The off-target sites of Zif268::N.φGamma in
λ DNA, however, appear to be AY^YS (e.g. AC^CC,
AC^TG, AT^CG). This limited spectrum of 'star' sites
could be the result of the few 'star' sites interrogated or of lower
'star' activity in the high-salt buffer.
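
The relationship between the cognate site, its one-mismatch 'star' sites and the degenerate AYYS pattern can be checked with a short script. This is a minimal sketch and not part of the original analysis: the helper names `hamming` and `matches_iupac` are hypothetical, and only the standard IUPAC nucleotide codes (e.g. Y = C/T, S = C/G) are assumed.

```python
# Standard IUPAC nucleotide codes (subset sufficient for this example).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}

def hamming(a, b):
    """Number of positions at which two equal-length sites differ."""
    if len(a) != len(b):
        raise ValueError("sites must have equal length")
    return sum(x != y for x, y in zip(a, b))

def matches_iupac(site, pattern):
    """True if every base of `site` is allowed by the IUPAC `pattern`."""
    return len(site) == len(pattern) and all(
        base in IUPAC[code] for base, code in zip(site, pattern)
    )

# Sites quoted in the text (nick position markers omitted).
COGNATE = "ACCG"
MN_STAR_SITES = ["ATCG", "CCCG", "ACTG", "GCCG"]   # N.phiGamma in Mn++
FUSION_OFF_TARGETS = ["ACCC", "ACTG", "ATCG"]      # fusion enzyme, lambda DNA

for site in MN_STAR_SITES + FUSION_OFF_TARGETS:
    print(site, hamming(site, COGNATE), matches_iupac(site, "AYYS"))
```

Under these definitions every listed site differs from ACCG at exactly one position, but CCCG and GCCG fall outside AYYS because their first base is not A, which is in line with the narrower 'star' spectrum described for the fusion enzyme.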

We constructed two versions of the Zif268::N.φGamma
fusion, R1 with a 13-aa linker and R2 with a 3-aa linker
between the ZFs and the N.φGamma catalytic domain. R1
is more active than R2 in DNA nicking, and both require an
N6 DNA spacer between the ZFP binding site and the
N.φGamma target sequence. More Zif268 and
N.φGamma fusions with variable linker lengths ranging
from 4 to 20 aa are probably necessary to pinpoint the
optimal spacing between the two functional domains. The
optimal linker between ZF arrays and the FokI cleavage
domain appears to be 4-8 aa residues for efficient
cleavage of target sites with a 6-bp spacer by ZFNs (60).
To better understand how N.φGamma is likely positioned
in the chimeric structure, we built three-dimensional
homology-based models of the DNA-bound and apo forms of
Zif268::N.φGamma (Ziflll R1). As shown in
Supplementary Figure S7, the model (monomer) adopts
an elongated shape and covers roughly two turns of the
DNA helix.
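
The composite target described here (a nicking-domain site, an N6 spacer and a Zif268 site) lends itself to a simple sequence scan. The following is an illustrative sketch only, not the procedure used in this work: the function names are hypothetical, the nicking-domain sites and Zif268 sites are the ones quoted in this Discussion, and the site orientation (nicking site 5' of the ZF site) is taken from the pUC-ATCG-N6-Zif3 construct.

```python
import re

# Nicking-domain sites reported in the text for the F5 domain.
NICK_SITES = ("ACCG", "ACCC", "ACTG", "ATCG")
# Zif268 9-bp sites reported in the text (T or G at position 4).
ZIF_SITE = "GCG[TG]GGGCG"
SPACER = 6  # N6 spacer required by both fusion constructs

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_composite_sites(seq, spacer=SPACER):
    """Return (strand, start, site) for each nick-site-N{spacer}-Zif268 match.

    Both strands are scanned; for the '-' strand, `start` is the 0-based
    offset within the reverse complement of `seq`. A lookahead is used so
    that overlapping candidate sites are all reported.
    """
    seq = seq.upper()
    pattern = re.compile(
        "(?=((?:%s)[ACGT]{%d}%s))" % ("|".join(NICK_SITES), spacer, ZIF_SITE)
    )
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1)))
    return hits

# Hypothetical test insert modelled on the ATCG-N6-Zif3 construct.
example = "TT" + "ATCG" + "AAAAAA" + "GCGTGGGCG" + "TT"
print(find_composite_sites(example))
```

Changing the `spacer` argument gives a quick way to list candidate sites for constructs with other linker lengths, although whether such sites are actually nicked would have to be tested experimentally.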

The ZF nickase made by fusing a ZF array to a FokI
cleavage domain requires dimerization in order to nick the
target sites, i.e. formation of a heterodimer by FokI
catalytic-proficient (R+) and catalytic-deficient (R-)
monomers. Similarly, a TALE-FokI nickase (to be constructed
and functionally tested) may also require dimer
formation (TALE-FokI R+ and TALE-FokI R-) in order
to nick target sites. The advantages of using an HNH
nicking domain are: (i) a single-molecule construct,
so that the cost of cloning, expression and purification
may be cut in half, (ii) the wide selection of natural HNH
nicking domains existing among the phage- and
prophage-encoded HNHEs and (iii) the minimal functional
nicking domain is fairly soluble in fusion with
other DNA binding partners. The disadvantage of using
the N.φGamma minimal nicking domain is that it carries
its own specificity. By site-directed mutagenesis, however,
its specificity could be further reduced. Alternatively, one
can use the nicking domain from N.BceSVIII with
frequent nicking sites (S^RT, ~2 bp) for construction of
chimeric nickases. Our initial study using the HNH
nicking domain from I-HmuI indicated that it has a
strong non-specific nicking activity in the absence of a
DNA binding partner and therefore mutations need to
be introduced to further reduce its affinity for DNA
(S. H. Chan and S.-Y. Xu, unpublished results).
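
The '~2 bp' figure quoted for the degenerate site S^RT can be reproduced by counting how much each IUPAC position narrows the choice of base. The sketch below is an illustration under a simplifying assumption that is not made in the text: a random sequence with equal base frequencies, considering one strand only. The function names are hypothetical.

```python
import math

# Number of bases allowed by each IUPAC nucleotide code.
IUPAC_SIZE = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

def site_probability(pattern):
    """Chance that a random position matches `pattern` (equal base frequencies)."""
    p = 1.0
    for code in pattern.upper():
        p *= IUPAC_SIZE[code] / 4.0
    return p

def effective_bp(pattern):
    """Specificity expressed as the equivalent number of fully specified bp."""
    # Each fully specified base contributes log2(4) = 2 bits.
    return -math.log2(site_probability(pattern)) / 2.0

for site in ("SRT", "ACCG", "GCGTGGGCG"):
    print(site, effective_bp(site), 1.0 / site_probability(site))
```

The last printed column is the expected spacing, in bp, between occurrences of the site on one strand under the same assumption, which gives a rough sense of how frequently a given nicking domain would cut on its own.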

A ZR domain isolated from the Nob1 RNA endonuclease
of the archaeon Pyrococcus horikoshii is sufficient to
bind helix 40 of the small subunit ribosomal
RNA (rRNA) (61), indicating that the ZR domain has its
own specificity. ZRs have been widely adopted by other
nucleases, DNA/RNA metabolic enzymes and
transcription factors (62-65). The ZR found in the McrA
endonuclease, a member of the HNHE superfamily, has
been studied previously by computer modeling and structure
prediction (34).

Metal ion requirement of HNHEs 

A few HNHEs are only active in Mn++ or Co++ buffers,
including N.BceSVIII, Lactobacillus phage Lrm1 gp54
(N.φLrm1) and the S. aureus prophage ORF Sap040a_009
HNHE. N.BceSVIII and phage N.φGamma are active
over the range of 1-10 mM Mn++ tested. Gp54 of
Lactobacillus phage Lrm1 (N.φLrm1), however, prefers
a low concentration of Mn++ (1 mM) for nicking. Thus,
the optimal Mn++ concentration supporting maximum
endonuclease activity depends on the individual enzyme.
The non-specific HNHE encoded by phage HK97
contains the zinc-binding motifs CxxC and CxxH in the
catalytic domain. No additional zinc-binding sites (ZRs)
were found at the N-terminus. HK97 gp74 HNHE prefers
Ni++ as a co-factor, although lower activity was also
detected with other divalent cations (33,66). HK97 gp74
HNHE also nicks plasmid DNA, but the sequence specificity
of the nicking sites (if any) has not yet been determined.
Previously, we found that HpyAV (CCTTC N6/N5), a member
of the HNH family of REases, also prefers Ni++ for optimal
endonuclease activity (67). Interestingly, the modification-
dependent endonuclease Sco McrA, an HNH family
enzyme, prefers Mn++ or Co++ in cleaving M.Dcm-
modified DNA (68).

In N.φGamma, the conserved 'ββα-metal' fold is
expected to display tetrahedral coordination of a Zn
ion for structural folding and an Mg++/Mn++ divalent
cation for catalytic function. The magnesium concentration
is estimated to be ~100 mM in actively growing
E. coli cells, although most of the Mg++ is bound
to nucleic acids. The Mn++ and Zn++ ions in
E. coli are estimated at much lower concentrations,
~0.2-0.4 mM for Mn++ and 0.1 mM for Zn++ (cells
grown in LB broth) (Bionumbers database at the web site:
http://bionumbers.hms.harvard.edu). Based on these estimates
of metal ion concentration, it is speculated that
Mg++ is more readily available for the N.φGamma
catalytic activity than Mn++, which may explain the observation
that most of the phage/prophage-encoded
HNHEs described in this work are not toxic or lethal to
E. coli, even though some of the HNHEs have frequent
nicking sites and were expressed in the absence of companion
methylase protection (alternatively, prophage gene expression
may be tightly repressed). Previously, we expressed Nt.CviPII
(^CCD, a frequent nickase encoded by chlorella virus NYs-1) in
E. coli (Nt.CviPII requires Mg++ for catalytic activity and
belongs to the PDxExK family of endonucleases). The
over-expression of Nt.CviPII requires the presence of a
cognate methylase for host DNA protection (7).

Off-target sites of Zif268::N.φGamma F5 fusion NEase

The N.φGamma catalytic domain F5 has its own specificity
(ACCG, ACCC, ACTG or ATCG) and Zif268 has a 9-bp
specificity (GCGTGGGCG or GCGGGGGCG). The
fusion enzyme has the combined sequence specificities,
separated by a 6-bp spacer. Although we have not analyzed many
'star' sites of the Zif268::N.φGamma F5 chimera, the following
observations were made from the limited number of
off-target sites in λ and pUC19 DNA. For a strong 'star'
site, 2-bp matches in each finger (6-bp matches total) are
better than 6-bp matches in only two fingers and a 0-bp
match in one finger (6-bp matches total). A site with 3-bp
matches in ZF3, 2-bp matches in ZF2 and a 1-bp match in
ZF1 (6-bp matches total) appears to be a 'star' site (e.g.
AC^CG-N6-GCGSGGGNN; ACCG can be substituted
by ACCC, ATCG or ACTG). It is possible that ZF2 and
ZF3 are more important than ZF1 in determining the
nicking specificity, owing to their proximity to the N.φGamma
nicking domain or their higher binding affinity for DNA.
Sequencing of more Ziflll R1 'star' sites is necessary to
confirm this observation.
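
The per-finger match counts used in this section can be tallied mechanically. The sketch below is illustrative only: the function name is hypothetical, and the assignment of ZF3, ZF2 and ZF1 to the first, middle and last triplet of the 9-bp site is inferred from the match counts quoted in the text rather than stated there explicitly. Each candidate is compared with both Zif268 sites given above and the better-scoring comparison is kept.

```python
# Zif268 9-bp sites quoted in the text.
COGNATE_SITES = ("GCGTGGGCG", "GCGGGGGCG")
# Assumed finger order for the three triplets, read 5'->3' on the site strand.
FINGERS = ("ZF3", "ZF2", "ZF1")

def finger_matches(candidate):
    """Matches per finger triplet against the best-matching cognate site."""
    candidate = candidate.upper()
    if len(candidate) != 9:
        raise ValueError("expected a 9-bp candidate site")
    best = None
    for cognate in COGNATE_SITES:
        counts = tuple(
            sum(a == b for a, b in zip(candidate[i:i + 3], cognate[i:i + 3]))
            for i in (0, 3, 6)
        )
        if best is None or sum(counts) > sum(best):
            best = counts
    return dict(zip(FINGERS, best))

# Two sites mentioned in the text: a lambda off-target site and Zif3star.
for site in ("GCGGGGGTA", "GCGCGGGGA"):
    print(site, finger_matches(site))
```

For GCGGGGGTA this gives 3, 3 and 1 matches in ZF3, ZF2 and ZF1, and for the Zif3star insert GCGCGGGGA it gives 3, 2 and 1, in agreement with the counts quoted for these sites.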

Fusion of N.φGamma F5 nicking domain to other
DNA binding elements

In theory, the N.φGamma minimal nicking domain can be
fused to other DNA binding proteins such as a
cleavage-deficient NotI REase. Because NotI forms a
dimer, the NotI (D160N)::N.φGamma F5 fusion may
nick the symmetric site AC^CG-Nx-GCGGCCGC-Nx-
CG^GT (estimated x = 6-12 bp). The N.φGamma
nicking domain may also be fused to a TALE effector
protein, which recognizes 14- to 24-bp recognition
sequences. The resulting fusion protein is expected to nick
DNA at rare sites. A number of labs have developed
methods for the modular construction of custom-designed
TALE effector proteins and TALE nucleases (69-76). The
optimal linker between the TALE effector protein and the
N.φGamma nicking domain needs to be tested experimentally
in future work. The homeodomain fold commonly
found in eukaryotic transcription factors consists of a
60-aa helix-turn-helix structure that binds to DNA or
RNA. A large number of homeodomains with new
specificities have been selected that can serve as the
target-recognizing domain (TRD) of the fusion nicking
enzyme (77). In addition, the N.φGamma nicking
domain can be conjugated to LNA (locked nucleic acid),
which binds to dsDNA by triple-helix formation.
Similarly, the N.φGamma nicking domain can be
coupled to a 5mC-recognition domain such as MBD
(methyl-binding domain protein) (78) or the 5mC specificity
domain of MspJI-family REases, McrB, McrA,
SauUSI or SRA proteins, generating 5mC-specific
nickases.

Possible biological function of phage and 
prophage-encoded HNHE 

Table 2 shows the gene organization of the phage/
prophage-encoded HNHEs, adjacent ZR proteins,
terminase small and large subunits and phage portal
proteins. The biological function of these phage- or
prophage-encoded HNH NEases is unknown. Because
these NEases are located near the DNA packaging
enzyme terminase, we speculate that they may
play a role in DNA packaging, e.g. by relieving the
supercoiling tension built up during DNA translocation
into the phage prohead. The nicks may stimulate
homologous recombination and enhance gene conversion
of HNHE-minus incoming phage to HNHE-plus phage,
which may also occur during mixed infection. HEases
make dsDNA breaks in alleles that are homologous to
the endonuclease gene but lack the intron or intein
element, which encodes the gene for the HEase. One
example of HNH NEase-stimulated trans-homing was
reported previously: an HNH nicking enzyme (mobE)
encoded by T4 phage introduces a strand-specific nick in
the non-coding strand of the nrdB gene of the related phage
T2. MobE promotes the mobility of the neighboring
non-functional homing endonuclease I-TevIII, which
is encoded within a Group I intron interrupting the
nrdB gene of phage T4 (79). In this case, the nicking
enzyme functions as a 'helper' HEase. The HNH nicking
enzymes may also play a role in the lysogenic life cycle of
phage. It was proposed that the phage HK97 HNHE, gp74,
generates dsDNA breaks (or ssDNA nicks in the presence of
certain metal ions), thereby initiating the bacterial SOS repair
response, which allows homologous recombination to occur at
the cleavage site, resulting in integration of the
phage genome (66). Another possibility yet to be ruled out
is the introduction of dsDNA breaks by an HNH nicking
endonuclease in conjunction with another nickase such as ParB
homologs or Vrr_nuc domain endonucleases, which
together function as restriction enzymes to restrict
invading DNA.

Table 2. Genes (ORFs) in proximity to the phage- or prophage-encoded HNHEs

| Phage or prophage | ZR proteinᵃ | HNHE | ZR proteinᵃ | Terminase S | Terminase L | Portal protein |
|---|---|---|---|---|---|---|
| B. cereus ATCC 10978 | | Bce_0389 | | Bce_0390 | 0397 | 0398 |
| Bacillus phage γ | γLSU_0048 | γLSU_0050 | | γLSU_0001 | 0002 | 0003 |
| Bacillus phage φ105 | | phi105_00255 | phi105_00260 | phi105_00005 | 00010 | 00020 |
| Bacillus subtilis subspecies spizizenii W23 | | Bsuw23_09635 | Bsuw23_09630 | Bsuw23_09625 | _09620 | _09610 |
| B. thuringiensis str T13001 (Bth193) | | Bth0005_4180ᵇ | unknown | | | |
| B. thuringiensis str T13001 (Bth67) | Bth0005_53510 | Bth0005_53520 | | Bth0005_53530 | 53540 | unknown |
| Clostridium phage phi3626 | phi3626_p49 | phi3626_p50 | | phi3626_p01 | _p02 | _p03 |
| Geobacillus virus E2 | GBVE2_gp069 | GBVE2_gp070 | | GBVE2_gp001 | gp002 | gp003 |
| Lactobacillus phage Sal2 | | LSL_279 | | LSL_280 | 281 | 282 |
| Lactobacillus phage Lrm1 | Lrm1_gp52 | Lrm1_gp54 | | Lrm1_gp01 | _gp02 | _gp03 |
| Staphylococcus aureus Y74T prophage Sap040a_009 | | Sap040A_009 | Sap040A_010 | Sap040A_0011 | _012 | _014 |
| Phage HK97 (AF069529) | | gp74ᶜ | | gp1 | gp2 | gp3 |

ᵃ The ZR motifs of the ZR proteins located either upstream or downstream of the HNHEs contain the amino acid sequence CxxC, CxxxC, HxxC,
HxxxC or CxxxxxH.

ᵇ There are two HNHEs in the shotgun genome sequences of B. thuringiensis str T13001. The N.BthT13001I (Bth193) activity has been verified. The
nicking activity of the shorter HNHE (67 aa) and the 110-aa ZR protein is unknown.

ᶜ The gp74 HNHE encoded by phage HK97 contains the zinc-binding motifs CxxC and CxxH. Gp74 has nicking activity on plasmids (in Ni++ buffer).

We have not studied the function of the ZRP located
next to the HNHE except for the Bth193 HNHE
(N.BthT13001I), which is a fusion of a ZRP to an HNHE
with four putative zinc-binding sites (8xZRs). The
adjacent ZRP could serve as the transcription regulator
of the HNHE gene or as a mini-specificity subunit that
could interact with the HNHE and further extend the
sequence specificity. Experiments are in progress to
express and purify these small ZRPs to elucidate their
function and to construct chimeric fusion NEases from
existing HNHEs by domain swapping.